90 research outputs found

    Machine Learning Techniques for Assistive Robotics

    Assistive robots are a category of robots that share their workspace and interact with humans […]

    Accurate and efficient 3D hand pose regression for robot hand teleoperation using a monocular RGB camera

    In this paper, we present a novel deep learning-based architecture, within the scope of expert and intelligent systems, that performs accurate real-time three-dimensional hand pose estimation from a single RGB frame, so there is no need for multiple cameras, multiple viewpoints, or RGB-D devices. The proposed pipeline is composed of two convolutional neural network architectures. The first is in charge of detecting the hand in the image. The second accurately infers the three-dimensional positions of the joints, thus retrieving the full hand pose. To do this, we captured our own large-scale dataset composed of images of hands and the corresponding 3D joint annotations. The proposal achieves a mean 3D hand pose error below 5 mm on both the proposed dataset and the public Stereo Hand Pose Tracking Benchmark, and it outperforms state-of-the-art methods. We also demonstrate the application of the proposal to robotic hand teleoperation with high success. This work has been supported by the Spanish Government Grant TIN2016-76515-R, supported with FEDER funds, and by the Spanish grant for PhD studies ACIF/2017/24.
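    The abstract describes a two-stage design: one CNN localizes the hand, and a second CNN regresses the 3D joints from the cropped region. Below is a minimal sketch of that flow; the layer sizes, the tiny_cnn helper, the fixed center crop, and the 21-joint count are illustrative assumptions, not the authors' actual networks.

```python
# Minimal two-stage monocular hand pose sketch (illustrative, not the
# paper's architecture): stage 1 predicts a hand box, stage 2 regresses
# 3D joint coordinates from the cropped hand region.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_JOINTS = 21  # assumed hand-joint count

def tiny_cnn(out_dim):
    """Small conv backbone with a linear head; a stand-in for a real CNN."""
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, out_dim),
    )

detector = tiny_cnn(4)                    # stage 1: hand box (x, y, w, h)
regressor = tiny_cnn(NUM_JOINTS * 3)      # stage 2: 3D joints from the crop

frame = torch.rand(1, 3, 256, 256)        # single RGB frame
box = detector(frame)                     # hand localization
# Crop around the detected hand; simulated here with a fixed center crop.
crop = F.interpolate(frame[:, :, 64:192, 64:192], size=(128, 128))
joints = regressor(crop).view(1, NUM_JOINTS, 3)  # full 3D hand pose
print(box.shape, joints.shape)
```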

    Interactive 3D object recognition pipeline on mobile GPGPU computing platforms using low-cost RGB-D sensors

    We propose the implementation of a 3D object recognition system optimized to operate under demanding time constraints. The system must be robust so that objects can be recognized properly in poor lighting conditions and in cluttered scenes with significant levels of occlusion. An important requirement must be met: the system must exhibit reasonable performance running on a low-power mobile GPU computing platform (NVIDIA Jetson TK1) so that it can be integrated into mobile robotics systems, ambient intelligence, or ambient-assisted living applications. The acquisition system is based on color and depth (RGB-D) data streams provided by low-cost 3D sensors like the Microsoft Kinect or PrimeSense Carmine. The resulting system is able to recognize objects in a scene in less than 7 seconds, offering an interactive rate and thus allowing its deployment on a mobile robotic platform. Because of that, the system has many possible applications, ranging from mobile robot navigation and semantic scene labeling to human–computer interaction systems based on visual information. A video showing the proposed system performing online object recognition in various scenes is available on our project website (http://www.dtic.ua.es/~agarcia/3dobjrecog-jetsontk1/).
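    As a concrete illustration of the RGB-D acquisition step such a pipeline starts from, the sketch below back-projects a Kinect-style depth image into a 3D point cloud with a pinhole camera model. The intrinsics (fx, fy, cx, cy) are typical illustrative values, not calibration parameters from the paper.

```python
# Minimal sketch: back-project a HxW depth image (meters) into an Nx3
# point cloud using a pinhole model with assumed intrinsics.
import numpy as np

def depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5):
    """Convert a depth image to 3D points in the camera frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop invalid (zero-depth) pixels

cloud = depth_to_point_cloud(np.random.uniform(0.5, 4.0, (480, 640)))
print(cloud.shape)  # ~307200 x 3 points for a full VGA depth frame
```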

    Geometric 3D point cloud compression

    The use of 3D data in mobile robotics applications provides valuable information about the robot's environment, but the huge amount of 3D information is usually unmanageable given the robot's storage and computing capabilities. Data compression is therefore necessary to store and manage this information while preserving as much of it as possible. In this paper, we propose a 3D lossy compression system based on plane extraction, which represents the points of each scene plane as a Delaunay triangulation and a set of point/area information. The compression system can be customized to achieve different compression or accuracy ratios. It also supports a color segmentation stage to preserve the original scene color information and provide a realistic scene reconstruction. The design of the method provides a fast scene reconstruction useful for further visualization or processing tasks. This work has been supported by the Spanish Government DPI2013-40534-R grant.
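    A minimal sketch of the core idea follows, under simplifying assumptions: points already segmented into a single plane are projected to 2D plane coordinates and reduced to a small subsample whose Delaunay triangulation stands in for the full point set. The point counts are arbitrary illustrative choices, not the paper's ratios.

```python
# Minimal plane-based compression sketch: replace a dense planar point set
# with a Delaunay triangulation over a small subsample of its points.
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(0)
plane_pts = rng.uniform(0, 1, (5000, 2))  # 2D coords on the fitted plane
keep = plane_pts[rng.choice(len(plane_pts), 50, replace=False)]  # subsample
tri = Delaunay(keep)  # compact triangulated representation of the plane
print(f"{len(plane_pts)} points -> {len(keep)} vertices, "
      f"{len(tri.simplices)} triangles")
```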

    3D Maps Representation Using GNG

    Current RGB-D sensors provide a large amount of valuable information for mobile robotics tasks like 3D map reconstruction, but the storage and processing of the incremental data provided by the different sensors over time quickly become unmanageable. In this work, we focus on 3D map representation and propose the use of the Growing Neural Gas (GNG) network as a model to represent 3D input data. The GNG method is able to represent the input data with a desired number of neurons, or resolution, while preserving the topology of the input space. Experiments show that the GNG method yields a better input-space adaptation than other state-of-the-art 3D map representation methods. This work was partially funded by the Spanish Government DPI2013-40534-R grant.
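    For concreteness, below is a compact numpy sketch of the classical GNG update loop (Fritzke-style) applied to a synthetic 3D cloud. The hyperparameters are textbook defaults rather than the paper's settings, and isolated-node pruning is omitted for brevity.

```python
# Compact Growing Neural Gas sketch: a small set of neurons adapts to 3D
# input points while an edge set tracks the topology of the input space.
import numpy as np

def gng(data, max_nodes=50, eps_b=0.2, eps_n=0.006, age_max=50,
        lam=100, alpha=0.5, d=0.995, steps=10000, seed=0):
    rng = np.random.default_rng(seed)
    W = [data[rng.integers(len(data))].copy() for _ in range(2)]  # weights
    E = [0.0, 0.0]                 # accumulated error per node
    edges = {}                     # {frozenset({i, j}): age}
    for t in range(1, steps + 1):
        x = data[rng.integers(len(data))]
        dists = [np.sum((x - w) ** 2) for w in W]
        s1, s2 = np.argsort(dists)[:2]          # nearest and second nearest
        E[s1] += dists[s1]
        W[s1] += eps_b * (x - W[s1])            # move winner toward x
        for e in list(edges):
            if s1 in e:                         # age winner's edges and
                edges[e] += 1                   # nudge its neighbors
                n = (e - {s1}).pop()
                W[n] += eps_n * (x - W[n])
        edges[frozenset((s1, s2))] = 0          # create/refresh edge
        edges = {e: a for e, a in edges.items() if a <= age_max}
        if t % lam == 0 and len(W) < max_nodes:
            q = int(np.argmax(E))               # highest-error node
            nbrs = [(e - {q}).pop() for e in edges if q in e]
            if nbrs:
                f = max(nbrs, key=lambda n: E[n])
                W.append((W[q] + W[f]) / 2)     # insert node between q and f
                E[q] *= alpha; E[f] *= alpha
                E.append(E[q])
                r = len(W) - 1
                edges.pop(frozenset((q, f)), None)
                edges[frozenset((q, r))] = 0
                edges[frozenset((f, r))] = 0
        E = [e * d for e in E]                  # global error decay
    return np.array(W), edges

cloud = np.random.default_rng(1).normal(size=(2000, 3))  # stand-in 3D map
W, edges = gng(cloud)
print(W.shape, len(edges))
```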

    TactileGCN: A Graph Convolutional Network for Predicting Grasp Stability with Tactile Sensors

    Tactile sensors provide useful contact data during interaction with an object, which can be used to learn to accurately determine the stability of a grasp. Most works in the literature represent tactile readings as plain feature vectors or matrix-like tactile images and use them to train machine learning models. In this work, we explore an alternative way of exploiting tactile information to predict grasp stability by leveraging graph-like representations of tactile data, which preserve the actual spatial arrangement of the sensor's taxels and their locality. In our experiments, we trained a graph neural network to classify grasps as stable or slippery. To train this network and prove its predictive capabilities for the problem at hand, we captured a novel dataset of approximately 5000 three-fingered grasps across 41 objects for training and 1000 grasps of 10 unknown objects for testing. Our experiments prove that this novel approach can be effectively used to predict grasp stability.
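    The sketch below illustrates the graph representation at the heart of this idea: taxels become nodes, physical adjacency becomes edges, and one Kipf-and-Welling-style graph convolution (normalized A·X·W) mixes neighboring readings. The 4x4 pad size and layer width are illustrative assumptions, not the paper's sensor or architecture.

```python
# Minimal taxel-graph sketch: build a 4-neighborhood adjacency for a taxel
# grid and apply a single normalized graph convolution to the readings.
import torch

def grid_adjacency(rows, cols):
    """4-neighborhood adjacency for a rows x cols taxel grid, with self-loops."""
    n = rows * cols
    A = torch.eye(n)
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols: A[i, i + 1] = A[i + 1, i] = 1
            if r + 1 < rows: A[i, i + cols] = A[i + cols, i] = 1
    d = A.sum(1)
    return A / torch.sqrt(d[:, None] * d[None, :])  # symmetric normalization

A_hat = grid_adjacency(4, 4)            # e.g. a 4x4 taxel pad per fingertip
X = torch.rand(16, 1)                   # one pressure reading per taxel
W = torch.nn.Linear(1, 8, bias=False)   # learnable graph-conv weights
H = torch.relu(A_hat @ W(X))            # one GCN layer: A_hat X W
print(H.shape)                          # -> torch.Size([16, 8])
```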

    Multi-sensor 3D object dataset for object recognition with full pose estimation

    We present a new dataset for 3D object recognition using the new high-resolution Kinect V2 sensor and other popular low-cost devices like the PrimeSense Carmine. Since most existing datasets for 3D object recognition lack features such as 3D pose information about the objects in the scene, per-pixel segmentation, or level of occlusion, we propose a new one that combines all this information in a single dataset, which can be used to validate existing and new 3D object recognition algorithms. Moreover, with the advent of the new Kinect V2 sensor we are able to provide high-resolution RGB and depth data using a single sensor, whereas other datasets had to combine multiple sensors. In addition, we also provide semi-automatic segmentation and semantic labels for the different parts of the objects, so that the dataset can be used for testing robot grasping and scene labeling systems as well as for object recognition. This work was partially funded by the Spanish Government DPI2013-40534-R Grant. It has also been funded by the grant "Ayudas para Estudios de Máster e Iniciación a la Investigación" from the University of Alicante.

    Pedestrian Movement Direction Recognition Using Convolutional Neural Networks

    Pedestrian movement direction recognition is an important factor in autonomous driver assistance and security surveillance systems. Pedestrians are the most crucial and fragile moving objects in streets, roads, and events, where thousands of people may gather on a regular basis. People flow analysis on zebra crossings and in shopping centers or at events such as demonstrations is a key element to improve safety and to enable autonomous cars to drive in real-life environments. This paper focuses on deep learning techniques such as convolutional neural networks (CNNs) to achieve reliable detection of pedestrians moving in a particular direction. We propose a CNN-based technique that leverages current pedestrian detection techniques (histograms of oriented gradients with a linear SVM) to generate a sum of subtracted frames (a flow estimation around the detected pedestrian), which is used as input for the proposed modified versions of various state-of-the-art CNNs, such as AlexNet, GoogLeNet, and ResNet. Moreover, we have also created a new dataset for this purpose and analyzed the importance of training on a known dataset for the neural networks to achieve reliable results. This work was supported by the Spanish Government through the COMBAHO project under Grant TIN2016-76515-R, supported with FEDER funds, and in part by the University of Alicante project under Grant GRE16-19.
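    A minimal sketch of the motion cue described above: summing absolute differences between consecutive frames inside a detected pedestrian bounding box yields a coarse flow-like image that a CNN can classify by movement direction. The frames and box here are synthetic stand-ins, not data from the paper.

```python
# Minimal sum-of-subtracted-frames sketch around a pedestrian detection.
import numpy as np

def motion_sum(frames, box):
    """Sum of absolute frame-to-frame differences, cropped to (x, y, w, h)."""
    x, y, w, h = box
    diffs = [np.abs(frames[i + 1].astype(np.int16) - frames[i].astype(np.int16))
             for i in range(len(frames) - 1)]
    return sum(d[y:y + h, x:x + w] for d in diffs)

frames = [np.random.randint(0, 256, (480, 640), dtype=np.uint8) for _ in range(5)]
crop = motion_sum(frames, box=(300, 200, 64, 128))  # box from a HOG+linSVM detector
print(crop.shape)  # (128, 64) motion image, fed to the direction CNN
```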

    A New Dataset and Performance Evaluation of a Region-Based CNN for Urban Object Detection

    In recent years, we have seen large growth in the number of applications that use deep learning-based object detectors. Advanced driver assistance systems (ADAS) are one of the areas where they have the most impact. This work presents a novel study evaluating a state-of-the-art technique for urban object detection and localization. In particular, we investigated the performance of the Faster R-CNN method in detecting and localizing urban objects in a variety of outdoor urban videos involving pedestrians, cars, bicycles, and other objects moving in the scene (urban driving). We propose a new dataset that is used for benchmarking the accuracy of a real-time object detector (Faster R-CNN). Part of the data was collected using an HD camera mounted on a vehicle. Furthermore, some of the data is weakly annotated so that it can be used for testing weakly supervised learning techniques. Urban object datasets already exist, but none of them includes all the essential urban objects. We carried out extensive experiments demonstrating the effectiveness of the baseline approach. Additionally, we propose an R-CNN plus tracking technique to accelerate the process of real-time urban object detection. This work has been partially funded by the Spanish Government Grant TIN2016-76515-R for the COMBAHO project, supported with FEDER funds. It has also been supported by the University of Alicante project GRE16-19.
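    For reference, the sketch below runs an off-the-shelf Faster R-CNN detector of the kind benchmarked above, using torchvision's pretrained COCO model as an illustrative stand-in for the paper's trained network; the 0.5 confidence threshold is an assumption.

```python
# Minimal Faster R-CNN inference sketch using torchvision's pretrained
# COCO model (illustrative stand-in, not the paper's trained detector).
import torch
import torchvision

# "DEFAULT" weights require torchvision >= 0.13; older versions use
# pretrained=True instead.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

frame = torch.rand(3, 480, 640)            # stand-in for an HD video frame
with torch.no_grad():
    out = model([frame])[0]                # boxes, labels, scores per frame
keep = out["scores"] > 0.5                 # confidence threshold (assumption)
print(out["boxes"][keep].shape, out["labels"][keep])
```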